ABSTRACT





Fermentative production of chlortetracycline is a complex fed-batch bioprocess. Cultivation generally takes
over 90 h, and the broth is often contaminated by undesired microorganisms. Once the fermentation system is
contaminated to a certain extent, product quality and yield are seriously affected, leading to substantial
economic loss. Using information fusion based on the Dezert-Smarandache theory, a self-recursive wavelet
neural network, and an unscented Kalman filter, a novel method for online prediction of contamination is
developed. All state variables of the culture process, both easy-to-measure variables and difficult-to-measure
variables commonly obtained with soft-sensors, present symptoms of contamination. By extracting and fusing
latent information from the changing trend of each variable, comprehensive and accurate prediction results for
contamination can be achieved, so that preventive and corrective measures can be taken promptly. Field
experimental results show that the method can be used to detect contamination in time, reducing production
loss and enhancing economic efficiency.

Keywords: Self-recursive wavelet neural network; Unscented Kalman filter
© 2015 The Chemical Industry and Engineering Society of China, and Chemical Industry Press. All rights reserved. 





1. Introduction 


Chlortetracycline (CTC) is an important antibiotic and a secondary 
metabolite of Streptomyces aureofaciens. It is characterized by bacterial 
inhibition, promotion of animal growth, high availability in animal 
feed, minimal residue in animal tissues, and low production cost. In 
recent decades, CTC has been the most consumed antibiotic in the animal
feed industry [1]. The most important considerations for the biochemical
industry are high yield, quality, and profit. Therefore, research efforts 
have often focused on in-depth physiological characteristics of cells to 
optimize industrial production. However, the risk of bacterial contami- 
nation is inevitable despite the maintenance of strict aseptic conditions 
during the production process. The biochemical plant we surveyed loses 
nearly 20 million CNY annually due to contamination-related issues. 
Therefore, the detection and prevention of bacterial contamination 
have been an active research area for the last two decades. 

Contamination is defined by the migration of an undesired microor- 
ganism along with the desired microorganism, which affects the normal 
growth of the latter. These fast-growing, bacterial contaminants and 
phages soon outnumber culture strains, producing large amounts of 
byproducts, severely inhibiting the growth and metabolism of the 
culture strain of interest. Furthermore, a large proportion of nutrients, 
especially glucose, for supporting the growth and CTC production by 
the culture strain, are diverted to the contaminants. Additionally,
destruction of the culture strain will lead to disastrous consequences 
for the CTC plant. 

Thus, early and accurate prediction of contamination of the culture
broth is of vital importance to biological fermentation, and several
methods are available for detecting or evaluating contamination in
laboratory or large-scale fermentation plants. These are broadly
classified into physical and biological methods. Physical methods
based on light, radiometry, and chromatography facilitate the rapid,
precise, and non-invasive evaluation of broth [2-4]. However, the
equipment is very expensive and requires a high level of maintenance,
and the tests are time-consuming and do not allow online application.
On the other hand, biological methods, which exploit the genetic,
immunological, and morphological characteristics of microorganisms
[5-7], afford high accuracy, but the need for operator expertise and
the time-consuming procedures are significant drawbacks. In contrast,
the soft-sensor prediction method, which works on the principle of
cause and effect, reveals the intrinsic biological relation between
measured and unmeasured states and has been employed by several
investigators [8,9]. This approach builds data-driven black-box models
from historical fermentation batch data and captures underlying
changes in the process state to judge whether the broth is
contaminated [10,11]. Several popular approaches such as principal
component analysis, partial least squares, and clustering have been
used in the recent literature on process monitoring for detecting and
diagnosing errors in the culture process [12-15].

Multisensor data fusion is widely applied in sensor networks,
robotics, video and image processing, and intelligent system design,
combining information from several sources to form a unified picture.
It offers a novel approach to information processing, and our
work is an offshoot of this idea. Each state variable obtained from the
CTC fermentation process is considered as a source of evidence with its
own frame of discernment, in which three elementary propositions are
defined; together with the union operator, these constitute seven focal
elements [16-19]. Primitive contamination information is represented as
a mass function over the seven focal elements for each source, extracted
from the latent information using the self-recursive wavelet neural
network (SRWNN) [20]. Next, by virtue of the Dezert-Smarandache theory
(DSmT), all contamination information can be integrated into a
comprehensive decision that facilitates the necessary preventive and
corrective measures.

In this work, we introduce the concept of DSmT and the principles of
the SRWNN and the unscented Kalman filter (UKF) algorithm [21]. A specific
procedure is then presented to achieve online prediction of contamination
in the culture process. The proposed method can be employed in
industrial-scale CTC plants.


2. Preliminaries 
2.1. Dezert-Smarandache theory 


The DSmT of plausible and paradoxical reasoning overcomes inherent
limitations of the classical Dempster-Shafer theory (DST). It is
based on the refutation of the principle of the third excluded middle
and, as a generalization of DST, can formally combine different types of
information sources (rational, uncertain, or paradoxical). Owing to the
vague, relative, and imprecise nature of the hyper-powerset $D^\Theta$ of
the general frame of discernment $\Theta$, the DSmT can solve complex
fusion problems where the DST or other methods often fail, especially
when conflicts between sources become strong and the refinement of the
frame of discernment $\Theta$ is inaccessible.

To understand the DSmT-based algorithm, three important concepts
are introduced briefly, namely, the hyper-powerset $D^\Theta$, the generalized
basic belief mass, and the proportional conflict redistribution rule.

The cornerstone of the DSmT is the concept of the hyper-powerset $D^\Theta$. In
order to fuse information, one defines the frame of discernment $\Theta$,
representing a source set of $n$ finite elements, $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$,
where $\theta_i$ represents a concrete hypothesis that cannot be precisely
defined and separated. Here, $D^\Theta$ is the set of all propositions built
from $\Theta$ with the $\cap$ and $\cup$ operators, and these propositions
must satisfy the following three conditions: (i) $\emptyset, \theta_1, \theta_2, \ldots, \theta_n \in D^\Theta$;
(ii) if $A, B \in D^\Theta$, then $A \cap B \in D^\Theta$ and $A \cup B \in D^\Theta$; and (iii) no other
elements belong to $D^\Theta$, except those obtained by using rules (i) or (ii).
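The closure described by conditions (i) to (iii) can be illustrated with a short Python sketch. It assumes the free DSm model, in which every intersection of hypotheses is non-empty, and encodes each hypothesis as the set of Venn-diagram regions it covers; the function name `hyper_power_set` and this encoding are illustrative choices, not part of the original method.

```python
from itertools import combinations

def hyper_power_set(n):
    """Enumerate D^Theta for n hypotheses under the free DSm model (assumed)."""
    # A Venn region is identified by the non-empty set of hypotheses covering it.
    regions = [frozenset(c)
               for k in range(1, n + 1)
               for c in combinations(range(1, n + 1), k)]
    # theta_i is the union of all regions that lie inside hypothesis i.
    thetas = [frozenset(r for r in regions if i in r) for i in range(1, n + 1)]
    elements = {frozenset()} | set(thetas)          # condition (i)
    changed = True
    while changed:                                  # condition (ii): close under union/intersection
        changed = False
        current = list(elements)
        for a in current:
            for b in current:
                for c in (a | b, a & b):
                    if c not in elements:
                        elements.add(c)
                        changed = True
    return elements                                 # condition (iii): nothing else is added

size_for_three = len(hyper_power_set(3))
```

For three hypotheses the closure contains 19 propositions including the empty set, which is why restricting the model to a handful of focal elements keeps the computation manageable.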

The second concept is the generalized basic belief mass. For every
evidential source defined on the frame of discernment $\Theta$, an associated
mapping $m(\cdot): D^\Theta \to [0, 1]$ is defined, which satisfies the following
condition

$$m(\emptyset) = 0 \quad \text{and} \quad \sum_{A \in D^\Theta} m(A) = 1. \tag{1}$$


Considering the inherent nature of the elements $\theta_i$, it is possible that the
non-exclusive and non-refinable elements of $\Theta$ turn into a new, finer,
exclusive frame of discernment. The quantity $m(A)$ represents the level of
trust in proposition $A$ and the direct support for $A$. The mapping $m(\cdot)$
is referred to as a generalized basic belief mass (gbbm).

The crux of the proposed method is the proportional conflict
redistribution (PCR) rule. PCR can be applied in both the DST and DSmT
frameworks for the combination of belief functions, and it works for any
degree of conflict in static or dynamic fusion situations. The PCR rule
redistributes the partial conflicting mass to the elements involved in
the partial conflict, considering the conjunctive normal form of the
partial conflict. PCR is considered the most mathematically exact
redistribution of conflicting mass to non-empty sets following the
logic of the conjunctive rule: it redistributes the conflicting mass
only to the sets involved in the conflict, in proportion to the
masses they placed in the conflict. The general PCR formula for $s \ge 2$ sources
is given in [16]: $m_{PCR}(\emptyset) = 0$ and, $\forall X \in G \setminus \{\emptyset\}$,


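$$
m_{PCR}(X) = m_{12 \ldots s}(X) +
\sum_{\substack{2 \le t \le s \\ 1 \le r_1, \ldots, r_t \le s \\ 1 \le r_1 < r_2 < \cdots < r_{t-1} < (r_t = s)}}
\;\sum_{\substack{X_{j_2}, \ldots, X_{j_t} \in G \setminus \{X\} \\ \{j_2, \ldots, j_t\} \in P^{t-1}(\{1, \ldots, n\}) \\ c(X \cap X_{j_2} \cap \cdots \cap X_{j_s}) = \emptyset \\ \{i_1, \ldots, i_s\} \in P^{s}(\{1, \ldots, s\})}}
\frac{\left(\prod_{k_1 = 1}^{r_1} m_{i_{k_1}}(X)\right)^{2} \cdot \prod_{l = 2}^{t} \left(\prod_{k_l = r_{l-1} + 1}^{r_l} m_{i_{k_l}}(X_{j_l})\right)}
{\prod_{k_1 = 1}^{r_1} m_{i_{k_1}}(X) + \sum_{l = 2}^{t} \left(\prod_{k_l = r_{l-1} + 1}^{r_l} m_{i_{k_l}}(X_{j_l})\right)}
\tag{2}
$$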


where $G$ corresponds to a constrained hyper-power set $D^\Theta$; $i$, $j$, $k$, $r$, $s$, and
$t$ are all integers; $m_{12 \ldots s}(X) \equiv m_\cap(X)$ corresponds to the conjunctive
consensus on $X$ between the $s$ sources, where all denominators are non-zero;
the set of all subsets of $k$ elements from $\{1, 2, \ldots, n\}$ (permutations
of $n$ elements taken by $k$) is denoted as $P^k(\{1, \ldots, n\})$, where the order of
elements does not count; and $c(X)$ is the canonical form (conjunctive normal
form) of $X$.


2.2. SRWNN model for gbbm 


The SRWNN model, which combines the attractor dynamics of recurrent
neural networks with the good convergence performance of wavelet neural
networks, can deal with time-varying inputs or outputs and shows good
identification performance. In this section, taking the fermentation
process into account, we briefly describe the application of the SRWNN
to obtain gbbm values for the input variables of the CTC fermentation
process. Firstly, we assume that there are only three hypotheses
for each state variable available from the culture process, that
is, $\Theta = \{\theta_1, \theta_2, \theta_3\}$, referred to as the frame of discernment. Next, we
formulate the hyper-powerset $D^\Theta$ by building it from the elements of $\Theta$ with
the operators $\cup$ and $\cap$. In order to decrease the complexity of calculation,
we assume that $D^\Theta$ contains only the following composite propositions:
$X_1 = \theta_1$, $X_2 = \theta_2$, $X_3 = \theta_3$, $X_4 = \theta_1 \cup \theta_2$, $X_5 = \theta_1 \cup \theta_3$, $X_6 = \theta_2 \cup \theta_3$, and
$X_7 = \theta_1 \cup \theta_2 \cup \theta_3$. Meanwhile, the focal elements from $D^\Theta$ satisfy the
constraint $\sum_{X_i \in D^\Theta} m(X_i) = 1$, where the quantity
$m(X_i)$ is the gbbm of $X_i$. Thus, the seven gbbm values, reduced for each
source of evidence, can be computed based on the SRWNN structure and a
least-squared-error-based learning algorithm. A schematic diagram of
the SRWNN structure is shown in Fig. 1, where No = wavelets. The 
SRWNN structure consists of four layers: input layer, mother wavelet 
layer with a self-feedback loop, wavelet layer, and output layer. The 
details for formulation and calculation of SRWNN have been described 
previously [20]. 


2.3. UKF algorithm for smoothing 


The collected signals generated from CTC fermentation process are 
susceptible to the environment for various reasons. Accurate and reliable 
results of desired contamination information rely on removal of noise 
from the sampled primitive signals. 

Fig. 1. Structure of SRWNN.

Normally, the nonlinear discrete-time system considered is of
the form

$$
\begin{aligned}
x(k + 1) &= f[x(k)] + w(k) \\
y(k) &= H[x(k)] + v(k)
\end{aligned}
\tag{3}
$$

where $k$ denotes discrete time, $k \in \mathbb{N}_0$ ($\mathbb{N}_0$ denotes the set of natural
numbers including zero), $x(k) \in \mathbb{R}^n$ is the state vector, and $y(k) \in \mathbb{R}^m$
is the measurement vector; the nonlinear mappings $f(\cdot)$ and $H(\cdot)$ are
assumed to be continuously differentiable with respect to $x(k)$; and $w(k)$
and $v(k)$ are the system state noise and output noise, respectively. For
systems of the form of Eq. (3), the UKF algorithm is considered the most
suitable filter algorithm for the CTC fermentation process owing to its
ability to approximate nonlinear process and measurement models. The
rationale behind the UKF algorithm, based on the unscented transformation,
is to use a minimal set of sample points to capture the true mean and
covariance of the nonlinear process, and then to estimate the posterior
mean and covariance when the set of sample points is run through the
nonlinear system, with errors introduced only in the second or higher
orders. The specific use of the UKF algorithm as a nonlinear filter has
been reported previously [21].
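The idea of propagating a small set of sample points through a nonlinearity can be sketched for a scalar state. The snippet below is only the unscented transformation step, not the full filter of [21]; the function name, the choice `kappa = 2`, and the quadratic test function are assumptions made for illustration.

```python
import math

def unscented_transform_1d(mean, var, f, kappa=2.0):
    """Propagate a scalar mean and variance through f using three sigma points."""
    n = 1                                           # state dimension
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]   # sigma points
    weights = [kappa / (n + kappa),                 # weight of the central point
               0.5 / (n + kappa),
               0.5 / (n + kappa)]
    ys = [f(p) for p in points]                     # run the points through the nonlinearity
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var

# Hypothetical example: a signal with mean 1.0 and variance 0.04 passed through x**2.
y_mean, y_var = unscented_transform_1d(1.0, 0.04, lambda x: x * x)
```

For this quadratic example the three points give a mean of about 1.04 and a variance of about 0.1632, which coincide with the exact moments for a Gaussian input. A full UKF would wrap this step in the usual predict and update cycle for the process and measurement models of Eq. (3).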


3. Procedures 


Development of an estimator system to detect contamination during 
CTC fermentation is explained in this section. The available variables, 
including online and offline, of the CTC fermentation process are as 
follows: temperature (TM), dissolved oxygen (DO), agitator current 
intensity (CI), ammonia accumulation (AA), glucose accumulation (GA), 
liquid volume (LV), air flow accumulation (AF), carbon dioxide concen- 
tration in exhaust (CO), fermentation time (T;), amino nitrogen concen- 
tration (AC), viscosity of culture broth (VS), titer of CTC (TI), and glucose 
concentration (GC). The essence of the proposed method is to combine 
all contamination information obtained from online sensors and soft- 
sensors into an accurate decision. A schematic diagram for realizing 
this method is illustrated in Fig. 2. The following sections describe in 
detail the procedure to preprocess historical data, establish the SRWNN 
model, detect unmeasured but important variables with a soft-sensor, 
fuse all contamination information with DSmT, and make the required
decision.


3.1. Data preprocessing 


The intensive, data-driven nature of the proposed method requires a 
sufficient amount of data. The preliminary work, which includes gather- 
ing, arrangement, and normalization of data, is crucial for building a 
robust and accurate model. Firstly, the selected batch data must be
complete, without any missing key state variables, and the duration
should cover the entire fermentation process, especially for the normal
process data set. To facilitate subsequent application, all batch data
sets are normalized to the 0-1 range and classified into three groups,
i.e., normal process data set, Bacillus infection data set, and phage
infection data set (the first part of Fig. 2). Meanwhile, all data in the
three groups are filtered using the UKF algorithm. Finally, 150 batches of
typical data, covering the whole year, are deliberately selected
from the preprocessed data set to constitute the training set of 120
batches and the test set of 30 batches.
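The normalization and the 120/30 partition can be sketched as follows. This is a minimal illustration under stated assumptions: min-max scaling is applied per variable, and a seeded random shuffle stands in for the deliberate batch selection described above, which the paper does not specify in detail. The helper names are hypothetical.

```python
import random

def min_max_normalize(values):
    """Scale one variable's samples to the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant signal: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def split_batches(batch_ids, n_train, seed=0):
    """Partition batch identifiers into training and test sets (random stand-in)."""
    ids = list(batch_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]

# 150 selected batches divided into 120 training and 30 test batches.
train_ids, test_ids = split_batches(range(150), n_train=120)

# Hypothetical temperature readings scaled to the 0-1 range.
scaled = min_max_normalize([28.0, 30.0, 32.0])
```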


3.2. Measuring difficult-to-measure variables with soft-sensor 


Invariably, there are some key measurements and quantities
reflecting cellular metabolism, energy transaction, microbial growth,
product yield, and so on that cannot be simply measured online by an
instrument, owing to the unavailability of the instrument, the high cost
of hardware sensors and their maintenance, or the poor reliability of the
sensors. An alternative is to measure them by laboratory analysis, by
sampling the culture broth during the fermentation process. However,
this procedure is time-consuming and arduous, which in turn increases
the cost of production, the amount of off-spec product, and the risk to
environmental safety. To solve this problem, a soft-sensor can be used to
estimate and predict important variables that are difficult to measure
physically. Using the variables measured with soft-sensors, four evidence
sources, indirectly reflecting the contamination of CTC fermentation,
can be obtained. Based on the proposed method, four soft-sensor
units have been built to measure the viscosity of the culture broth, titer
of CTC, amino nitrogen concentration, and glucose concentration. To
enhance the reliability of the soft-sensors, correction units have been
developed for the four variables, so that the model parameters can be
corrected whenever the value from laboratory analysis differs from that
of the soft-sensor [22]. Fig. 3 depicts the working principle of the
soft-sensors for the four variables, in which the inputs of the model
include known, online, continuous variables. The results generated from
the soft-sensors are available in real time and change during the CTC
fermentation process.


3.3. Fusing information based on DSmT 


Whether the CTC fermentation process is contaminated cannot be
read directly by measuring the culture broth online. However, some
indication that the broth may contain Bacillus or phages
can be obtained by comparing the current trend of process variables
with the normal control at the same time point. During the culture
process, different state variables have varying susceptibility or response
time for the same source of contamination, and this is an inherent
feature of CTC fermentation. It is therefore advisable to combine
all available information from the state variables to capture accurate
and comprehensive information. In terms of fusing uncertain, imprecise,
and conflicting information, the DSmT methodology is more advantageous
than DST. For the convenience of combining contamination information,
each state variable is considered to have the same frame of
discernment, containing three hypothesis elements. We
then define a mapping associated with each source of evidence and
construct the gbbm as follows: $m(\theta_1)$ is defined as the gbbm for
non-contamination, denoted by $m(X_1)$; $m(\theta_2)$ is defined as the gbbm for
definite Bacillus contamination, denoted by $m(X_2)$; $m(\theta_3)$ is defined as
the gbbm for definite phage infection, denoted by $m(X_3)$; $m(\theta_1 \cup \theta_2)$ is
defined as the gbbm for probable Bacillus contamination, denoted by
$m(X_4)$; $m(\theta_1 \cup \theta_3)$ is defined as the gbbm for probable phage infection,
denoted by $m(X_5)$; $m(\theta_2 \cup \theta_3)$ is defined as the gbbm for probable Bacillus
contamination or phage infection, denoted by $m(X_6)$; and $m(\theta_1 \cup \theta_2 \cup \theta_3)$
is defined as the gbbm for probable Bacillus contamination and phage
infection, denoted by $m(X_7)$. Based on the SRWNN method, the model
structure relating each input process variable to the seven gbbms is
formulated. The overall procedure adopts a trial-and-error methodology,
with repeated testing and revision until the performance requirement of
the application is met (see the second part of Fig. 2).

In the CTC fermentation process, there are 12 process state variables
in all, comprising online and offline variables, each of
which is considered as a body of evidence for contamination. To find
symptoms of contamination from these process state quantities,
it is necessary to combine information from the 12 diverse variables
and make a final judgment or decision on the next specific steps. The
information fusion architecture is illustrated in the third part of Fig. 2.

Fig. 2. Implementation flowchart of the proposed method.

Finally, the generalized basic belief masses of every process
variable generated by the SRWNN are combined and normalized, and then
serve as the input to PCR. A schematic diagram of the DSmT
neural network structure (DSmTNN), given in Fig. 4, has two evidence
sources, 14 inputs, and seven normalized outputs, and is composed of
four layers.

Layer 1 is the input layer: each input node corresponds to the gbbm 
of a focal element of a single source. The input layer accepts the gbbm 
values and transmits them to layer 2. 

Layer 2 is the multiplication layer: each node performs multiplication
of the two incoming masses from layer 1. For instance, node $k$ has
two input masses, $m_1(X_i)$ and $m_2(X_j)$, and it produces the output
$M_k = m_1(X_i) \times m_2(X_j)$, so the number of multiplications in layer 2 is $i \times j$.

Layer 3 is the summation layer: it consists of seven sum nodes, each
adding its respective incoming masses from layer 2. Each node output
corresponds to the non-normalized combined mass

$$m_{12}(X) = \sum_{X_i \cap X_j = X} \alpha M_k \tag{4}$$

where $\alpha = 1$ if $X_i \cap X_j = X$, else $\alpha = 0$.

Fig. 3. Topological structure of soft-sensor.


Layer 4 is the normalization layer: in DSmT, if $\mathrm{Card}(\Theta) = |\Theta| = 3$,
then $|D^\Theta| = 19$. Obviously, this size is too large to establish a moderate
model structure and compute the fusion information accurately. To
simplify the model structure and decrease the computational burden, we
assume that all focal elements in this study are void except the seven
combinations of the three propositions in the
frame of discernment. Consequently, in the fourth layer, a normalization
method is used to bring the values from the summation
layer, which may lie outside the boundary, within the range of 0 to 1. The
output of the $i$th node is the following ratio, which denotes the normalized
combined mass

$$\tilde{m}_{12}(X_i) = \frac{m_{12}(X_i)}{\sum_{j=1}^{7} m_{12}(X_j)}. \tag{5}$$
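The arithmetic carried out by layers 2 to 4 can be sketched in Python as follows. This is an illustration rather than the authors' implementation: focal elements are encoded as sets of hypothesis indices, any product whose intersection falls outside the seven retained focal elements is treated as void (following the simplifying assumption above), and the function and variable names are hypothetical.

```python
from itertools import product

# Seven retained focal elements X1..X7, encoded as sets of hypothesis indices.
FOCAL = [frozenset(s) for s in ({1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3})]

def combine_normalized(m1, m2):
    """Pairwise products (layer 2), sums per focal element (layer 3), normalization (layer 4)."""
    sums = {x: 0.0 for x in FOCAL}
    for (xi, a), (xj, b) in product(m1.items(), m2.items()):
        x = xi & xj                      # X_i intersected with X_j
        if x in sums:                    # empty intersections are treated as void
            sums[x] += a * b             # accumulate M_k into m12(X), Eq. (4)
    total = sum(sums.values())
    if total == 0.0:
        raise ValueError("sources are in total conflict; nothing to normalize")
    return {x: v / total for x, v in sums.items()}   # Eq. (5)

# Hypothetical gbbm values for two sources, e.g. viscosity and titer.
m_vs = {frozenset({1}): 0.6, frozenset({1, 2}): 0.3, frozenset({1, 2, 3}): 0.1}
m_ti = {frozenset({1}): 0.5, frozenset({2}): 0.2, frozenset({1, 2, 3}): 0.3}
fused = combine_normalized(m_vs, m_ti)
```

In this example the only discarded product is the conflicting pair of "no contamination" from one source and "definite Bacillus" from the other, and the remaining masses are rescaled to sum to one.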


Per part 4 of Fig. 2, performing the initial fusion of the output
information from part 3 of Fig. 2 by means of the DSmTNN yields a primary
result in six groups: {VS, TI}, {AC, GC}, {DO, AA}, {CI, TM}, {GA, AF}, and {CO,
LV}. Among these six data pairs, the first two are obtained from
soft-sensor data, while the remaining four are real-time data pairs. Each
pair result then requires further information fusion, in the form of two
new bodies of evidence, so that the result is more accurate and reliable
than the initial outcome.

Fig. 4. Cascade DSmTNN model for information fusion.

Sometimes, the primary result may present some conflict if it is 
determined solely from the gbbm values. In view of this challenge, PCR 
is considered the best combination rule to synthesize relative information, 
which then transfers (total and partial) conflicting mass to non-empty sets 
involved in the conflicts proportional to the mass assigned to them by the 
source. PCR includes five versions, PCR1 through PCR5, with increasing
complexity of rules and precision of redistributing conflicting masses. 

The method to combine the primary fusion information using PCR5
is described in brief. First, the conjunctive rule is applied as follows
to obtain the set $\{m_{12}(X_1), \ldots, m_{12}(X_7)\}$:

$$m_{12}(X) = \sum_{\substack{X_i, X_j \in D^\Theta \\ X_i \cap X_j = X}} m_1(X_i)\, m_2(X_j) \tag{6}$$


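Next, with the following PCR5 formula, the set $\{m_{PCR5}(X_1), \ldots,
m_{PCR5}(X_7)\}$ may be obtained.

$$
m_{PCR5}(X_i) = m_{12}(X_i) + \sum_{\substack{X_j \in G \setminus \{X_i\} \\ X_i \cap X_j = \emptyset}}
\left[ \frac{m_1(X_i)^2\, m_2(X_j)}{m_1(X_i) + m_2(X_j)} + \frac{m_2(X_i)^2\, m_1(X_j)}{m_2(X_i) + m_1(X_j)} \right]
\tag{7}
$$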
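A compact sketch of the two-source PCR5 combination is given below. It assumes mass functions stored as dictionaries keyed by sets of hypothesis indices, with two focal elements regarded as conflicting when their intersection is empty; the function name `pcr5` and the example masses are illustrative, not taken from the plant data.

```python
from itertools import product

def pcr5(m1, m2):
    """Two-source PCR5 for mass functions keyed by frozenset focal elements."""
    out = {}
    for (x, a), (y, b) in product(m1.items(), m2.items()):
        z = x & y
        if z:                                   # compatible pair: conjunctive term
            out[z] = out.get(z, 0.0) + a * b
        elif a + b > 0.0:                       # conflicting pair: return a*b to x and y
            out[x] = out.get(x, 0.0) + a * a * b / (a + b)   # share proportional to a
            out[y] = out.get(y, 0.0) + b * b * a / (a + b)   # share proportional to b
    return out

# Hypothetical masses: source 1 favors "normal", source 2 favors "Bacillus".
m1 = {frozenset({1}): 0.6, frozenset({2}): 0.4}
m2 = {frozenset({1}): 0.3, frozenset({2}): 0.7}
combined = pcr5(m1, m2)
```

Because each conflicting product is split between exactly the two focal elements that produced it, the combined masses still sum to one without a separate normalization step.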


In the proposed method, applying the PCR5 rule repeatedly may
yield the combined information for both the online and the offline data.
Owing to the difference in reliability between the two data types,
which stems from the nature of the CTC fermentation plant, a reliability
weighting should be added to the fusion information before making
the final decision. Therefore, the final fusion result between the online
and offline data can be obtained as follows.

$$
m(X) = \frac{[m^{on}(X)]^{\alpha}\, [m^{off}(X)]^{\beta}}{\sum_{X_j \in D^\Theta} [m^{on}(X_j)]^{\alpha}\, [m^{off}(X_j)]^{\beta}}
\tag{8}
$$

where $\alpha$ and $\beta$ are statistically determined weights, satisfying $\alpha + \beta = 1$,
with $\alpha < \beta$ in general.


3.4. Making the final decision 


Once the previous steps in the information combination procedure
are complete, yielding six pairs of initial combinations based on the
DSmTNN, four groups of secondary combinations based on PCR5, and
a final weighted combination, an algorithm is chosen to make the
final decision in the form of a probability. The pignistic algorithm is
selected for this purpose, as it represents the intuitive and operable
properties of probability:

$$\forall A \in D^\Theta, \quad P\{A\} = \sum_{X \in D^\Theta} \frac{|X \cap A|}{|X|}\, m(X) \tag{9}$$

where $|X|$ is the cardinal of proposition $X$ in the DSm model. Writing
Eq. (9) out explicitly as Eq. (10), each gbbm is distributed according to the
sequence $\{\theta_1, \theta_2, \theta_3\}$ to obtain $\{P\{\theta_1\}, P\{\theta_2\}, P\{\theta_3\}\}$.


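$$
\begin{aligned}
P\{\theta_1\} &= m(\theta_1) + \tfrac{1}{2} m(\theta_1 \cup \theta_2) + \tfrac{1}{2} m(\theta_1 \cup \theta_3) + \tfrac{1}{3} m(\theta_1 \cup \theta_2 \cup \theta_3) \\
P\{\theta_2\} &= m(\theta_2) + \tfrac{1}{2} m(\theta_1 \cup \theta_2) + \tfrac{1}{2} m(\theta_2 \cup \theta_3) + \tfrac{1}{3} m(\theta_1 \cup \theta_2 \cup \theta_3) \\
P\{\theta_3\} &= m(\theta_3) + \tfrac{1}{2} m(\theta_1 \cup \theta_3) + \tfrac{1}{2} m(\theta_2 \cup \theta_3) + \tfrac{1}{3} m(\theta_1 \cup \theta_2 \cup \theta_3)
\end{aligned}
\tag{10}
$$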


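Then, the normalized set $\{\bar{P}\{\theta_1\}, \bar{P}\{\theta_2\}, \bar{P}\{\theta_3\}\}$ is computed as follows.

$$\bar{P}\{\theta_i\} = \frac{P\{\theta_i\}}{\sum_{j=1}^{3} P\{\theta_j\}} \tag{11}$$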


From an application point of view, the normalized set $\{P\{\theta_1\}, P\{\theta_2\}, P\{\theta_3\}\}$
provides the contamination information of the CTC fermentation
process expressed as probabilities, as described in part 5 of Fig. 2. Thus,
if a key constraint or condition exceeds the predefined contamination
limit, the system will warn the operator. Based on the judgment of an
experienced worker using this information, preventive and
corrective measures can be taken in time.
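The decision step can be sketched in a few lines of Python. The sketch assumes the fused mass function is keyed by sets of hypothesis indices (1 for normal, 2 for Bacillus, 3 for phage); the alarm limits of 0.4 and 0.3 mirror the thresholds quoted in the results of this paper and are used here only as illustrative defaults, and the function names are hypothetical.

```python
def pignistic(m):
    """Share each focal mass equally among its hypotheses, then normalize."""
    p = {1: 0.0, 2: 0.0, 3: 0.0}
    for focal, mass in m.items():
        for theta in focal:
            p[theta] += mass / len(focal)           # terms of Eq. (10)
    total = sum(p.values())
    if total == 0.0:
        raise ValueError("mass function carries no information")
    return {theta: value / total for theta, value in p.items()}   # Eq. (11)

def alarms(p, limits):
    """Return the hypotheses whose probability exceeds its predefined limit."""
    return [theta for theta, limit in limits.items() if p[theta] > limit]

# Hypothetical fused masses for a batch drifting towards Bacillus contamination.
m_final = {frozenset({1}): 0.3, frozenset({2}): 0.5, frozenset({1, 2}): 0.2}
p_final = pignistic(m_final)
raised = alarms(p_final, {2: 0.4, 3: 0.3})
```

With these example masses the Bacillus probability is about 0.6, so only the Bacillus alarm is raised and the operator would be prompted to inspect the batch.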


3.5. Results and discussion 


In the CTC fermentation process, several challenges, such as mechan- 
ical failures, process disruption, operational or instrument errors, contrib- 
ute to data records of contamination. A breach of aseptic conditions in any 
part of the operation would expose the fermentation to a high risk of 
infection by undesired microorganism. Before discussing the eventuality 
of contamination, we first identify the issues that can result in infection 
using a flowchart of CTC culture procedure, illustrated in Fig. 5. The 
flowchart contains four phases including strain preparation, primary 
seed amplification, secondary seed amplification, and final fermentation 
process. During these steps, the likelihood of introducing harmful 
microorganisms into production process involves both situations and 
equipment, namely, leakage in the pipes carrying sterile air, agitator 
malfunction, failure of gasket or o-ring valves, deviation from vessel 
and media sterilization procedures, contamination during initial or 
mid-cycle inoculations, contamination during tank-to-tank transfers, 
contamination during offline collection of broth at the sampling port, 
failure of exhaust outlet fan, and contamination of water, air, defoamer, 
and so on. These causes of contamination in the CTC fermentation plant 
have been determined based on the data obtained from actual plant 
operations, but several other undetectable phenomena may occur as well. 

The proposed scheme based on DSmT was tested experimentally to predict
the occurrence of contamination online and to guide corrective measures
in a 130 m³ fermenter of the CTC plant. We divide the experimental results
into three situations according to the type of microbial infection.
Figs. 6-8 each present three curves: the normal state, denoted as ‘Normal’;
Bacillus infection, denoted as ‘Bacilli’; and phage infection, denoted as
‘Phage’. The horizontal axis denotes the duration of the culture process,
while the vertical axis denotes the probability as a percentage. In the
following, we elaborate on the typical distribution of the three trajectories.

The plot for a normal fed-batch process shows that the culture process
is either uncontaminated or the extent of contamination is below the
predefined lowest threshold of detection. In Fig. 6, the ‘Normal’ trajectory
stays above 60% for most of the batch; it gets close to 50% at
approximately 37 h, but this dip lasts for less than 3 h. The other two
trajectories fluctuate independently around approximately 20% from
beginning to end. Thus the culture process can be considered a normal
batch only if the percentage of the ‘Normal’ curve is larger than 50%.

Fig. 7 illustrates the probability variation of the three curves once 
the fermentation process is contaminated, i.e., the broth is infected by 
a Bacillus species, which is the most common and infectious contam- 
inant of the CTC fermentation. Therefore, ‘Bacilli’ is considered as a 
representative contaminant, outside of the phages. 

In Fig. 7, the average probability of the ‘Bacilli’ trajectory exceeds the
lowest threshold of approximately 40%, which is predefined as
the determination condition of contamination, at about 40 h, while the
trajectory of ‘Normal’ is less than 50% and that of ‘Phage’ is at 15%. This
result indicates that the overgrowth of Bacillus would threaten the
fermentation strain, so necessary measures should be taken to inhibit the
growth of the contaminant. If this process is unchecked, large amounts of glucose and other
Fig. 5. CTC fermentation process phases.

Fig. 7. Probability variation of a Bacillus-infected batch.

Fig. 8. Probability variation of a phage-infected batch.

Finally, infection of the CTC fermentation process by phages
(bacteriophages) is illustrated in Fig. 8. A phage is a kind of virus that
can infect and replicate within bacteria and is widely distributed in areas
populated by bacterial hosts. ‘Phage’ is also a general name for microbes and
viruses causing infection of bacteria, fungi, actinomycetes, spirochetes,
etc. If the culture process is infected by phages, common symptoms
include slow or sluggish growth over the fermentation time
and a greatly reduced final product yield. Other indicators
include reduced carbon dioxide in the exhaust, a gradually lighter
substrate, progressively increasing pH, a dark color of the broth instead
of the normal yellowish-brown, and large amounts of residual glucose.
Fig. 8 describes a phage-infected batch, where the probability of ‘Phage’
rapidly increases to 60% between approximately 15 and 26 h of the culture
process. Similar to identifying a Bacillus infection, an average probability
of the ‘Phage’ trajectory exceeding 30%, which is predefined as the lowest
threshold, is considered to indicate infection by phages; here the
trajectory of ‘Normal’ is also less than 60%.

Compared with a Bacillus infection, the control and prevention of
phage infection are much more challenging once the
culture is infected by a virulent phage. Symptoms of a low-grade phage
infection are often invisible, but with the feed rate kept constant, the
concentration of glucose will begin to rise with time, until a majority of
cells are lysed within hours of infection and the fermentation stops.
This phenomenon depends on several factors, including the type of phage,
the stage of fermentation at which infection occurs, the quantity of phages
relative to that of the host, the substrate components, and the
physical and chemical environment in the fermenter. Thus infections
with the same phage may show diverse symptoms.

In actual operation, the tendency of phage infection can be estimated 
through experience or with the aid of software, such as the method 
proposed in this study. Previous records of phage infections lead to the 
following conclusions: infections occurring in the seed culture phase 
may spread to all production fermenters; infections in the early phase 
of culture can make industrial-scale fermentation difficult; and 
infections occurring towards the end of the culture process generally do 
not exhibit any obvious symptoms. Methods for preventing and 
attenuating the harm caused by phage infection are still a subject of 
intensive research. Routine methods such as the addition of chelating 
agents, non-ionic detergents, and antibiotics may abate phage 
propagation, but when an infection occurs in a fed-batch culture due to 
poor equipment, the best choice is to stop the infected batch, discard 
all contaminated material, and thoroughly clean and sterilize the 
equipment. Such actions must be undertaken even at the expense of 
closing the entire plant. 

In some CTC fermentation plants, whether a culture has been 
contaminated is decided by manual judgment, with the help of two 
methods, microscopy and bacterial culture. Both are carried out offline, 
which is not only time-consuming and laborious but also increases the 
risk of contamination accidents. As an alternative strategy, the proposed 
method predicts the state of contamination in real time from process 
data, thus overcoming the drawbacks of manual, offline tests. The 
prediction performance for Bacillus and phage contaminants is listed in 
Table 1. The prediction performance for phage infection is clearly 
superior to that for Bacillus infection. This can be attributed to the 
destructive character of phage infection, with its short latent period 
and large burst size, which produces marked changes that the proposed 
method captures easily, whereas Bacillus infection is characterized by a 
mild reaction and a gradual process. 


Table 1 
Prediction performance for contamination 

Term        Accuracy rate/%    False alarms/%    Missing alarms/% 
Bacillus    61                 11                28 
Phages      84                 9                 17 


Note: all data are collected from two plants for five months in 2013. 


Since contamination cannot be eradicated completely, a more viable 
goal is to minimize its impact on the yield and quality of products and 
on the cost of production in the CTC fermentation process. The control 
of the culture can be adjusted accordingly on the basis of the proposed 
method. Using the prediction results, several corrective measures are 
possible, depending on the time of infection. If the infection occurs at 
the beginning, corrective measures may be taken to curb the extent of 
contamination, such as decreasing the temperature, pH, feed rate, 
airflow rate or agitation rate, adding a moderate amount of antibiotic or 
similar sterilizing agent to the culture broth, or sterilizing the medium 
and re-inoculating the current batch. If the infection occurs in the 
steady stage, the above-mentioned measures, as well as altering the 
culture environment and adding an antibacterial agent, can be attempted 
to keep the culture running, but the broth should be discharged early 
once these measures become ineffective. If the infection occurs towards 
the end of the culture, nothing can be done except to observe the trend 
of the infection. 

Compared with control based on the conventional judgment of 
contamination, the proposed method shows improved prediction capability 
and enhanced economic benefit by virtue of online prediction. Table 2 
compares the proposed and conventional methods; all values are 
averages. 


Table 2 
Comparison between new and conventional methods 

Term            Prediction time    Prediction time    CTC yield/μg·ml⁻¹    Discarded 
                for Bacillus/h     for phages/h                            batch/% 
New method①     31.5               23.1               22,673               4.74 
Con. method②    41.3               29.7               21,354               6.13 

Note: all data are collected from two plants for five months in 2013. 
① New method is the proposed method in the context. 
② Con. method is the conventional method in the context. 
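
The differences in Table 2 can be expressed as relative changes with a short calculation. The sketch below only re-expresses the tabulated averages. The dictionary keys and the helper `relative_change` are illustrative names, and it is assumed here that "prediction time" is the elapsed time before contamination is flagged, so that a smaller value means earlier detection.

```python
# Average values copied from Table 2 (new vs. conventional method).
new = {"bacillus_h": 31.5, "phage_h": 23.1,
       "yield_ug_ml": 22673.0, "discarded_pct": 4.74}
con = {"bacillus_h": 41.3, "phage_h": 29.7,
       "yield_ug_ml": 21354.0, "discarded_pct": 6.13}


def relative_change(new_value: float, old_value: float) -> float:
    """Relative change of new_value with respect to old_value, in percent."""
    return 100.0 * (new_value - old_value) / old_value


# Relative change of each Table 2 column, rounded to one decimal place.
changes = {key: round(relative_change(new[key], con[key]), 1) for key in new}

for key, value in changes.items():
    print(f"{key}: {value:+.1f}%")
```

Under that assumption, the prediction time falls by about 23.7% for Bacillus and 22.2% for phages, the CTC yield rises by about 6.2%, and the share of discarded batches falls by about 22.7% in relative terms (1.39 percentage points).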


Several aspects of the proposed method, which uses information fusion 
based on DSmT, still have scope for improvement. For example, 
additional case records would increase the accuracy and stability of the 
system. 


4. Conclusions 


Formulating a mechanistic model of CTC fermentation is challenging 
because of its intrinsically nonlinear nature and time-dependent 
variability. At the same time, few online methods are available to 
monitor contamination of the culture process. Infection control in 
large-scale CTC fermentation plants is therefore equally challenging. 
With the help of information fusion based on DSmT, the tendency of 
contamination during the CTC fermentation process can be predicted 
accurately and in time, indirectly utilizing both measured and 
unmeasurable variables. Results from application in actual plants show 
that the proposed method can reduce the risk of infection, maintenance 
costs, and the labor required, while improving the yield, quality, and 
economic efficiency. 